Tied-Mixture Language Modeling in Continuous Space

نویسندگان

  • Ruhi Sarikaya
  • Mohamed Afify
  • Brian Kingsbury
چکیده

This paper presents a new perspective to the language modeling problem by moving the word representations and modeling into the continuous space. In a previous work we introduced Gaussian-Mixture Language Model (GMLM) and presented some initial experiments. Here, we propose Tied-Mixture Language Model (TMLM), which does not have the model parameter estimation problems that GMLM has. TMLM provides a great deal of parameter tying across words, hence achieves robust parameter estimation. As such, TMLM can estimate the probability of any word that has as few as two occurrences in the training data. The speech recognition experiments with the TMLM show improvement over the word trigram model.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Segment-Based Acoustic Models for Continuous Speech Recognition

ity or acoustic observations conditioned on the state in Tied-mixture (or semi-continuous) distributions are an imhidden-Markov models (11MM), or for the case of the portant tool for acoustic modeling, used in many highSSM, conditioned on a region of the model. Some of the performance speech recognition systems today. This paper options that have been investigated include discrete dispiovides a...

متن کامل

On The Use Of Tied-Mixture Distributions

Tied-mixture (or semi-continuous) distributions are an important tool for acoustic modeling, used in many highperformance speech recognition systems today. This paper provides a survey of the work in this area, outlining the different options available for tied mixture modeling, introducing algorithms for reducing training time, and providing experimental results assessing the trade-offs for sp...

متن کامل

Improved context-dependent acoustic modeling for continuous Chinese speech recognition

This paper describes the new framework of context-dependent (CD) Initial/Final (IF) acoustic modeling using the decision tree based state tying for continuous Chinese speech recognition. The Extended Initial/Final (XIF) set is chosen as the basic speech recognition unit (SRU) set according to the Chinese language characteristics, which outperforms the standard IF set. An adaptive mixture increa...

متن کامل

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

متن کامل

Phonetic state tied-mixture tone modeling for large vocabulary continuous Mandarin speech recognition

This paper presents a new approach to tone modeling for continuous Mandarin speech recognition. Mandarin tones provide rich information for speech recognition. In this paper, we treat the tone as an attribute of the final vowel part of a Mandarin syllable. Separate distributions are estimated for cepstral coefficients and pitch features respectively, and the phonetic state tied-mixture techniqu...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009